Minimizing Regret in Discounted-Sum Games
Abstract
In this paper, we study the problem of minimizing regret in discounted-sum games played on weighted game graphs. We give algorithms for the general problem of computing the minimal regret of the controller (Eve) as well as several variants depending on which strategies the environment (Adam) is permitted to use. We also consider the problem of synthesizing regret-free strategies for Eve in each of these scenarios.
Related Papers
Solving Two-Player Zero-Sum Repeated Bayesian Games
This paper studies two-player zero-sum repeated Bayesian games in which every player has a private type that is unknown to the other player, and the initial probability of the type of every player is publicly known. The types of players are independently chosen according to the initial probabilities, and are kept the same all through the game. At every stage, players simultaneously choose actio...
Learning Nash Equilibrium for General-Sum Markov Games from Batch Data
This paper addresses the problem of learning a Nash equilibrium in γ-discounted multiplayer general-sum Markov Games (MGs) in a batch setting. As the number of players in an MG increases, the agents may either collaborate or team apart to increase their final rewards. One solution to address this problem is to look for a Nash equilibrium. Although several techniques were found for the subcase of ...
Games with Vector Payoffs: A Dynamic Programming Approach
Games with Vector Payoffs: A Dynamic Programming Approach, by Vijay Sukumar Kamble. Doctor of Philosophy in Engineering – Electrical Engineering and Computer Sciences, University of California, Berkeley. Professor Jean Walrand, Chair. In several decision-making scenarios in adversarial environments, a decision-maker cares about multiple objectives at the same time. For example, in certain defense op...
Minimizing Regret on Reflexive Banach Spaces and Nash Equilibria in Continuous Zero-Sum Games
We study a general adversarial online learning problem, in which we are given a decision set X in a reflexive Banach space X and a sequence of reward vectors in the dual space of X. At each iteration, we choose an action from X, based on the observed sequence of previous rewards. Our goal is to minimize regret. Using results from infinite dimensional convex analysis, we generalize the method ...
Minimizing Regret: The General Case
In repeated games with differential information on one side, the labelling "general case" refers to games in which the action of the informed player is not known to the uninformed, who can only observe a signal which is the random outcome of his and his opponent's action. Here we consider the problem of minimizing regret (in the sense first formulated by Hannan [8]) when the information available...